Structural compression for document analysis
Authors
Abstract
In this paper we describe a structural compression technique to be used for document text image storage and retrieval. The primary objective is to provide efficient representation, storage, transmission and display. A secondary objective is to provide an encoding which allows access to specified regions within the image and facilitates traditional document processing operations without requiring complete decoding. We describe an algorithm which symbolically decomposes a document image and structurally orders the error bitmap based on a probabilistic model. The resultant symbol and error representations lend themselves to reasonably high compression ratios and are structured so as to allow operations directly on the compressed image. The compression scheme is implemented and compared to traditional compression methods.
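As a rough illustration of the symbolic decomposition idea, the following Python sketch encodes a list of already-segmented glyph bitmaps as prototype indices plus per-symbol error bitmaps. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names, the Hamming-distance matching rule, and the `max_errors` threshold are all hypothetical, and the probabilistic ordering of the error bitmap described in the abstract is not reproduced here.

```python
from typing import List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]   # rows of 0/1 pixels
Glyph = Tuple[Bitmap, int, int]        # (bitmap, x, y) of a segmented component
Symbol = Tuple[int, int, int]          # (prototype index, x, y)


def xor_bitmap(a: Bitmap, b: Bitmap) -> Bitmap:
    """Pixelwise XOR of two equally sized bitmaps."""
    return tuple(tuple(p ^ q for p, q in zip(ra, rb)) for ra, rb in zip(a, b))


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Number of pixels in which two equally sized bitmaps differ."""
    return sum(sum(row) for row in xor_bitmap(a, b))


def same_shape(a: Bitmap, b: Bitmap) -> bool:
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def encode(glyphs: List[Glyph], max_errors: int = 2):
    """Split glyphs into prototypes, a symbol stream, and error bitmaps.

    Assumption: a glyph matches the first prototype of the same shape that
    differs in at most `max_errors` pixels (a hypothetical matching rule).
    """
    prototypes: List[Bitmap] = []
    symbols: List[Symbol] = []
    errors: List[Bitmap] = []
    for bitmap, x, y in glyphs:
        match = None
        for i, proto in enumerate(prototypes):
            if same_shape(proto, bitmap) and hamming(proto, bitmap) <= max_errors:
                match = i
                break
        if match is None:
            prototypes.append(bitmap)
            match = len(prototypes) - 1
        symbols.append((match, x, y))
        # The residual is what must be stored to make the coding lossless.
        errors.append(xor_bitmap(prototypes[match], bitmap))
    return prototypes, symbols, errors


def decode(prototypes: List[Bitmap], symbols: List[Symbol], errors: List[Bitmap]) -> List[Glyph]:
    """Rebuild every glyph exactly from its prototype and error bitmap."""
    return [(xor_bitmap(prototypes[i], err), x, y)
            for (i, x, y), err in zip(symbols, errors)]


def symbols_in_region(symbols: List[Symbol], x0: int, y0: int, x1: int, y1: int) -> List[Symbol]:
    """Select symbols whose origin lies in a region, using only the symbol stream."""
    return [s for s in symbols if x0 <= s[1] <= x1 and y0 <= s[2] <= y1]


if __name__ == "__main__":
    a = ((1, 1), (1, 0))
    a_noisy = ((1, 1), (1, 1))   # differs from `a` in one pixel
    b = ((0, 0), (0, 1))
    glyphs = [(a, 0, 0), (a_noisy, 10, 0), (b, 20, 0)]
    protos, syms, errs = encode(glyphs, max_errors=1)
    print(len(protos), syms)
    print(decode(protos, syms, errs) == glyphs)
```

The region query touches only the symbol stream, which is one way to read the abstract's goal of accessing specified regions without complete decoding; the error bitmaps are needed only when an exact reconstruction is requested.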
Similar resources
Proceedings of the International Conference on Pattern Recognition, volume C, pages 664-668, 1996 Structural Compression for Document
In this paper we describe a structural compression technique to be used for document text image storage and retrieval. The primary objective is to provide efficient representation, storage, transmission and display. A secondary objective is to provide an encoding which allows access to specified regions within the image and facilitates traditional document processing operations without requirin...
Document Image Compression and Analysis Your Full Name
Title of Dissertation: Your Dissertation Title Your Full Name, Doctor of Philosophy, 1997 Dissertation directed by: Academic title and name of advisor Department of Mathematics Image compression usually considers the minimization of storage space as its main objective. It is desirable, however, to code images so that we have the ability to process the resulting representation directly. In this ...
Group 4 Compressed Document Matching
Numerous approaches, including textual, structural and featural, for detecting duplicate documents have been investigated. Considering document images are usually stored and transmitted in compressed forms, it is advantageous to perform document matching directly on the compressed data. A two-stage process for matching Group 4 compressed document images is presented. In the coarse matching stag...
Proceedings of the International Conference on Image Processing, 1996 STRUCTURE-PRESERVING DOCUMENT IMAGE COMPRESSION Omid
Maintaining a document in image form is often preferable in order to avoid the high cost of manual conversion or the introduction of large numbers of errors by automatic OCR and/or graphics interpretation. The large volume of data in the image can be greatly reduced by using compression techniques. Text-intensive document images typically have a great deal of redundancy in the bitmap representa...
Structure-preserving document image compression
Maintaining a document in image form is often preferable in order to avoid the high cost of manual conversion or the introduction of large numbers of errors by automatic OCR and/or graphics interpretation. The large volume of data in the image can be greatly reduced by using compression techniques. Text-intensive document images typically have a great deal of redundancy in the bitmap representa...
Publication date: 1996